feat(template): replacing template-zero with source-based template (#321) #336
guzmud wants to merge 25 commits into
Codecov Report
❌ Patch coverage is
Additional details and impacted files

| Coverage Diff | main | #336 | +/- |
|---|---|---|---|
| Coverage | 82.43% | 69.00% | -13.43% |
| Files | 41 | 139 | +98 |
| Lines | 1674 | 7076 | +5402 |
| Hits | 1380 | 4883 | +3503 |
| Misses | 294 | 2193 | +1899 |
Pull request overview
Introduces a new collector template that models collectors as a generic engine fed by an implementer-provided source (data fetcher + source data + signatures), replacing the prior “collector vs services” shape with a “collector vs source” split.
Changes:
- Adds a reusable collector framework (engine, protocols, models, resilient uploaders) plus a minimal runnable template collector entrypoint.
- Adds configuration system based on Pydantic Settings (YAML / .env / env var sources) and docker/packaging scaffolding.
- Adds a unittest suite covering protocols, models, engine behavior, and uploaders.
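The collector-vs-source split described in the overview can be sketched roughly as follows. This is a minimal illustration and not the PR's code: `DataFetcher`, `Source`, `MiniEngine`, `StaticFetcher`, and their methods are hypothetical names chosen for the example, and signature matching is reduced to plain key/value equality.

```python
from dataclasses import dataclass
from typing import Protocol


class DataFetcher(Protocol):
    """Hypothetical protocol: anything that can return raw source records."""

    def fetch(self) -> list[dict[str, str]]: ...


@dataclass(frozen=True)
class Source:
    """Hypothetical bundle the implementer hands to the generic engine."""

    name: str
    data_fetcher: DataFetcher
    supported_signatures: tuple[str, ...]


class MiniEngine:
    """Hypothetical generic engine: it knows nothing about a specific product."""

    def __init__(self, source: Source) -> None:
        self.source = source

    def match(self, expectation: dict[str, str]) -> list[dict[str, str]]:
        # Keep only the signature types the source declares support for.
        wanted = {
            key: value
            for key, value in expectation.items()
            if key in self.source.supported_signatures
        }
        if not wanted:
            return []
        # A record matches when it agrees with every supported signature.
        return [
            record
            for record in self.source.data_fetcher.fetch()
            if all(record.get(key) == value for key, value in wanted.items())
        ]


class StaticFetcher:
    """Toy fetcher standing in for a real API client."""

    def fetch(self) -> list[dict[str, str]]:
        return [
            {"hostname": "ws-01", "process_name": "cmd.exe"},
            {"hostname": "ws-02", "process_name": "powershell.exe"},
        ]


source = Source("demo", StaticFetcher(), ("hostname",))
engine = MiniEngine(source)
matches = engine.match({"hostname": "ws-02", "unsupported": "ignored"})
```

In this shape, adapting the template to a new product would only mean swapping `StaticFetcher` and the signature list; the engine stays untouched.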
Reviewed changes
Copilot reviewed 50 out of 65 changed files in this pull request and generated 23 comments.
| File | Description |
|---|---|
| template/.dockerignore | Docker build context exclusions for config/artifacts/caches. |
| template/.gitignore | Git ignores for config/artifacts/caches. |
| template/CONTRIBUTING.md | Contributor docs for the template (currently mismatched with new layout). |
| template/Dockerfile | Container build/run for the template collector. |
| template/README.md | Template usage/configuration documentation (currently references old layout). |
| template/docker-compose.yml | Example compose service for running the template collector. |
| template/manifest-metadata.json | Collector manifest metadata for the template image. |
| template/pyproject.toml | Poetry/PEP621 config, deps/extras, dev tooling config, entrypoint script. |
| template/src/__init__.py | Package exports for ConfigLoader. |
| template/src/__main__.py | Module entrypoint calling main(). |
| template/src/config.yml.sample | Sample YAML config for the template. |
| template/src/img/template-logo.png | Collector icon asset placeholder. |
| template/src/models/__init__.py | Exports ConfigLoader. |
| template/src/models/settings/__init__.py | Exports settings model components. |
| template/src/models/settings/base_settings.py | Base Pydantic Settings config for nested env parsing and immutability. |
| template/src/models/settings/collector_configs.py | OpenAEV + collector settings models. |
| template/src/models/settings/config_loader.py | ConfigLoader that merges sources and flattens config for daemon. |
| template/src/models/settings/template_configs.py | Template-specific settings (key, time window, batch size). |
| template/src/py.typed | Marks package as typed (PEP 561). |
| template/src/source/__init__.py | Source package marker. |
| template/src/source/template_data_fetcher.py | Placeholder data fetcher implementation. |
| template/src/source/template_signatures.py | Placeholder supported signature list. |
| template/src/source/template_source_data.py | Placeholder source data model implementation. |
| template/src/template_collector.py | Runnable example that wires Source + BaseCollector and starts it. |
| template/src/collector/__init__.py | Collector package marker. |
| template/src/collector/collector.py | BaseCollector daemon wrapper that instantiates/configures the engine. |
| template/src/collector/engines/__init__.py | Engines package marker. |
| template/src/collector/engines/basic.py | Generic processing engine (fetch/filter/process expectations, upload results/traces). |
| template/src/collector/helpers/__init__.py | Helpers package marker. |
| template/src/collector/internals/__init__.py | Internals package marker. |
| template/src/collector/internals/oaev_uploaders.py | OpenAEV expectation/trace uploaders built on resilient uploader. |
| template/src/collector/internals/resilient_uploader.py | Generic bulk uploader with fallback to individual uploads. |
| template/src/collector/models/__init__.py | Models package marker. |
| template/src/collector/models/data.py | OAEVData and TraceData models. |
| template/src/collector/models/exception.py | Custom exception hierarchy. |
| template/src/collector/models/expectations.py | Expectation result/trace/summary models and formatting helpers. |
| template/src/collector/models/source.py | Source definition and default SourceHandler implementation. |
| template/src/collector/protocols/__init__.py | Protocols package marker. |
| template/src/collector/protocols/data_fetcher.py | Protocol for data fetchers. |
| template/src/collector/protocols/engine.py | Protocol for collector engines. |
| template/src/collector/protocols/source_data.py | Protocol for source data models. |
| template/src/collector/protocols/source_handler.py | Protocol for source handlers. |
| template/src/collector/types/__init__.py | Types package marker. |
| template/src/collector/types/collector.py | Type aliases for collector structures. |
| template/src/collector/types/internals.py | Type aliases for resilient uploader injection points. |
| template/src/collector/utils/__init__.py | Utils package marker. |
| template/src/collector/utils/retroport_itertools.py | Python 3.11-compatible batched() helper. |
| template/tests/__init__.py | Tests package marker. |
| template/tests/test_template_collector.py | Tests wiring of template_collector.main(). |
| template/tests/collector/__init__.py | Tests subpackage marker. |
| template/tests/collector/test_collector.py | Tests BaseCollector initialization/setup behaviors. |
| template/tests/collector/engines/__init__.py | Tests engines package marker. |
| template/tests/collector/engines/test_basic.py | Tests BasicCollectorEngine behaviors and flow. |
| template/tests/collector/internals/__init__.py | Tests internals package marker. |
| template/tests/collector/internals/test_oaev_uploaders.py | Tests expectation/trace uploader behavior. |
| template/tests/collector/internals/test_resilient_uploader.py | Tests resilient uploader bulk+fallback behavior. |
| template/tests/collector/models/__init__.py | Tests models package marker. |
| template/tests/collector/models/test_data.py | Tests OAEVData/TraceData validation and formatting. |
| template/tests/collector/models/test_expectations.py | Tests expectation models behavior. |
| template/tests/collector/models/test_source.py | Tests Source and SourceHandler behavior. |
| template/tests/collector/protocols/test_data_fetcher.py | Tests protocol conformance for data fetchers. |
| template/tests/collector/protocols/test_engine.py | Tests protocol conformance for engines. |
| template/tests/collector/protocols/test_source_data.py | Tests protocol conformance for source data. |
| template/tests/collector/protocols/test_source_handler.py | Tests protocol conformance for source handlers. |
| template/tests/collector/utils/__init__.py | Tests utils package marker. |
| template/tests/collector/utils/test_retroport_itertools.py | Tests retroported batched() selection/behavior. |
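Two rows in the table describe behavior that is easy to illustrate: the retroported `batched()` helper (`itertools.batched` only exists from Python 3.12 onward) and the bulk uploader that falls back to individual uploads. The sketch below is an assumption about how such pieces could fit together, not the PR's implementation; `batched_compat`, `resilient_upload`, `fake_bulk`, and `fake_single` are hypothetical names.

```python
from collections.abc import Callable, Iterable, Iterator
from itertools import islice
from typing import TypeVar

T = TypeVar("T")


def batched_compat(items: Iterable[T], size: int) -> Iterator[tuple[T, ...]]:
    """Yield tuples of at most `size` items, like itertools.batched (3.12+)."""
    if size < 1:
        raise ValueError("size must be at least one")
    iterator = iter(items)
    while batch := tuple(islice(iterator, size)):
        yield batch


def resilient_upload(
    items: Iterable[T],
    bulk_upload: Callable[[tuple[T, ...]], None],
    single_upload: Callable[[T], None],
    batch_size: int = 2,
) -> list[T]:
    """Upload in batches; retry a failed batch item by item.

    Returns the items that failed even when uploaded individually.
    """
    failed: list[T] = []
    for batch in batched_compat(items, batch_size):
        try:
            bulk_upload(batch)
        except Exception:
            # Bulk call failed: isolate the bad items instead of losing the batch.
            for item in batch:
                try:
                    single_upload(item)
                except Exception:
                    failed.append(item)
    return failed


uploaded: list[int] = []


def fake_bulk(batch: tuple[int, ...]) -> None:
    # Toy backend that rejects any batch containing a negative number.
    if any(value < 0 for value in batch):
        raise ValueError("bad batch")
    uploaded.extend(batch)


def fake_single(value: int) -> None:
    # Toy backend that rejects a single negative number.
    if value < 0:
        raise ValueError("bad item")
    uploaded.append(value)


failures = resilient_upload([1, 2, -3, 4, 5], fake_bulk, fake_single)
```

Injecting the two upload callables keeps the fallback logic independent of any API client, which is presumably why the PR also ships type aliases for the uploader's injection points.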
Comments suppressed due to low confidence (2)
template/src/collector/types/collector.py:4
`SignatureGroups` is defined as `list[dict[str, str]]`, but `get_expectation_signature_groups()` and `match_signature_groups_and_oaevdata()` treat `signature_groups` as a mapping (`.items()`) from signature type to a list of signature dicts. This mismatch will either break typing (mypy) or lead implementers of the protocol to return the wrong shape. Update `SignatureGroups` to the actual structure used (e.g., `dict[str, list[dict[str, str]]]`) and align protocol/implementations accordingly.
from typing import TypeAlias
SignatureGroups: TypeAlias = list[dict[str, str]]
template/pyproject.toml:128
`[tool.cmw] icon-path` points to `src/img/change-me-logo.png`, but this template ships `src/img/template-logo.png` instead. This will break any tooling that relies on `tool.cmw` metadata for the icon. Update `icon-path` (or rename the asset) so the referenced file exists.
[tool.cmw]
install-command = "poetry install --extras local"
config-dump-command = "poetry run python -m src --dump-config-schema"
icon-path = "src/img/change-me-logo.png"
- `--extra current`: Get pyoaev from Git release/current branch
- `--extra local`: Get pyoaev locally from `../../client-python`

### Development Installation

```bash
# Development setup with current pyoaev version
poetry install -E current --with dev,test
```
RUN if [[ ${PYOAEV_GIT_BRANCH_OVERRIDE} ]] ; then \
    echo "Forcing specific version of client-python" && \
    apk add --no-cache git && \
    pip install pip3-autoremove && \
    pip-autoremove pyoaev -y && \
]

HttpUrlToString = Annotated[HttpUrl, PlainSerializer(str, return_type=str)]
TimedeltaInSeconds = Annotated[
@Kakudou it seems `TimedeltaInSeconds` has never really been used (it exists in palo alto XDR, sentinelone and splunk-es, but is not used in any of them): should we replace the type of `period` with it, or just delete it? (I guess you were the one who wrote it in the first place in splunk-es; I don't know whether it's a leftover to be deleted or unfinished WIP.)
Proposed changes
- Replaces the `collector` vs `services` split with a `collector` vs `source` split (cf. internal documentation for now)
- Uses a `custom` name for the custom configuration (previously it was the `$name-of-your-collector` configuration, which changed for each collector)
Testing Instructions
Related issues
Checklist
Further comments
Nota bene: this branch was forked from #320 (hence the commits related to template zero)
Executive summary from the internal documentation